How We Recovered from a Major System Failure in 72 Hours | Pivots with Alex Palatnick

Update: 2025-04-21
Description

Alex Palatnick shares a high-stakes story of a critical system failure at 51 Mines and the tough pivot his team had to make to recover. When an ISIS array started acting up, the solution required a full deletion and rebuild, an intense decision that had the entire team working tirelessly to restore operations within 72 hours. Alex reflects on the challenges, the importance of quick decision-making, and the lessons learned from navigating technical crises.

Key Topics:
- The unexpected failure of an ISIS array and its impact
- How the engineering team assessed the situation
- The critical decision to delete and rebuild the system
- The importance of backing up data on LTO
- The recovery process and lessons from the experience

Quotes:
- "The Pivot was as simple as making the decision: delete the array, bring it back up again, and start restoring everything."
- "Everybody was back working again within 72 hours, but it was ugly. Ugly. Wasn't any fun."
- "We took a real hard look at it, made sure all the media was backed up, and pulled the trigger on the fix."

See more: https://www.indexrgb.com/

#TechRecovery #CrisisManagement #PivotAndRestore
